Detection and Classification of Interacting Persons
Abstract
This chapter presents a way to classify interactions between people. Examples of the interactions we investigate are people meeting one another, walking together and fighting. A new feature set is proposed along with a corresponding classification method. Results are presented which show the new method performing significantly better than the previous state-of-the-art method proposed by [Oliver et al., 2000].
INTRODUCTION
This chapter presents an investigation into the classification of multiple-person interactions. There has been much previous work on identifying the activity in which an individual person is engaged. [Davis and Bobick, 2001] used a moment-based representation built on extracted silhouettes, and [Efros et al., 2003] modeled human activity by generating optical flow descriptions of a person's action. Descriptions were generated by first hand-tracking an individual, re-scaling to a standard size and then taking the optical flow of the person's actions over several frames. A database of these descriptions was created and matched to novel situations. This method was extended by [Robertson, 2006], who also included location information to give contextual information about a scene. Location information is of assistance when trying to determine whether someone is loitering or merely waiting at a road crossing. Following on from flow-based features, [Dollar et al., 2005] extracted spatio-temporal features to identify sequences of actions. Ribeiro and Santos-Victor [Ribeiro and Santos-Victor, 2005] took a different approach to classifying an individual's actions: they used multiple features calculated from tracking (such as speed and eigenvectors of flow) and selected those features which best classified the person's actions using a classification tree, with each branch using at most 3 features to classify the example.
The classification of interacting individuals was studied by [Oliver et al., 2000], who used tracking to extract the speed, alignment and derivative of the distance between two individuals. This information was then used to classify sequences using a coupled hidden Markov model (CHMM). [Liu and Chua, 2006] expanded two-person classification to three-person sequences using a hidden Markov model (HMM) with an explicit role attribute. Information derived from tracking was used to provide features, such as the relative angle between two persons, to classify complete sequences. Xiang and Gong [Gong and Xiang, 2003] again used a CHMM, in their case to model interactions between vehicles on an aircraft runway. Their features are calculated by detecting significantly changed pixels over several frames. The correct model for representing the sequence is determined by the connections between the separate models, with goodness of fit calculated by the Bayesian information criterion. Using this method, a model representing the sequence's actions is determined. Multi-person interaction within a rigid formation was the goal of [Khan and Shah, 2005], who used a geometric model to detect rigid formations between people; an example would be a marching band. [Intille and Bobick, 2001] used a pre-defined Bayesian network to describe planned motions during American football games. Others, such as [Perse et al., 2007], also use a pre-specified template to evaluate the current action being performed by many individuals. Pre-specified templates have also been used by [Van Vu et al., 2003; Hongeng and Nevatia, 2001] within the context of surveillance applications.
SPECIFICALLY WHAT ARE WE TRYING TO DO?
Given an input video sequence, the goal is to automatically determine whether any interactions between two people are taking place. If any are taking place, we want to identify the class of the interaction. Here we limit ourselves to pre-defined classes which have been previously labeled.
To make the situation more realistic there is also a 'no interaction' class. We seek to give each frame a label from a predefined set. For example, a label may be that person 1 and person 2 are walking together in frame 56. The ability to automatically classify such interactions would be useful in cases which are typical of many surveillance situations. It would also be useful in video summarization, where it would make it possible to focus only on specific interactions.
FEATURES AND VARIABLES
Video data is rich in information, with modern surveillance cameras capable of delivering megapixel resolution at a sustained frame rate greater than 10 fps. Such data is overwhelming and mostly unnecessary for the classification of interactions. As a first step, tracking of the individuals is employed. There is a rich body of work on tracking people and objects within the literature; see the review by Yilmaz et al. for a good survey of current tracking technology [Yilmaz et al., 2006]. For all experiments performed in this chapter the bounding box method of identifying a person was used. The positional information was calculated from the centre of this box. This process is illustrated in Figure 1. Such tracking information is typical of the output of many tracking procedures, and it is assumed that such a tracker is available throughout all experiments carried out in this chapter.
Figure 1. Bounding box tracking. Colored lines show the previous positions of the centre of the tracked object.
Three types of features are used as input to a classifier: movement, alignment and distance based features. More details are given in the following sections.
Movement Based features
Movement plays an important role in recognizing interactions. The speed of an individual is calculated as shown in equation (1.1).
The double vertical bar ($\|\cdot\|$) represents a vector L2 norm as given by $\|\mathbf{x}\|_2 = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$, where $x_n$ refers to the nth component of vector $\mathbf{x}$.
$$s_i^t = \frac{\left\| \mathbf{p}_i^t - \mathbf{p}_i^{t-w} \right\|}{w} \qquad (1.1)$$
Here $\mathbf{p}_i^t$ refers to the position of tracked object i at time t. Within this work only the two-dimensional case ($\mathbf{p}_i^t = [x_i^t, y_i^t]$) is considered, because tracking information is only available in two dimensions. The temporal offset w is introduced because of the high frame rates which typify many modern video cameras. With frame rates of around 25 frames per second, differences between the current and last frame (w=1) can be very small and may be dominated by noise. The absolute difference in speed between two tracks is also calculated: $\varepsilon_{i,j}^t = |s_i^t - s_j^t|$. The vorticity ($v_i^t$) is measured as a deviation from a line. This line is calculated by fitting a line to a set of previous positions of the trajectory, $\mathbf{P}_i^t = [\mathbf{p}_i^{t-w}, \dots, \mathbf{p}_i^t]$. The window size w is the same as that used in equation (1.1). At each point the orthogonal distance to the line is found. The distances of all points are then summed and normalized by the window length to give a measure of the vorticity.
Alignment Based features
The alignment of two tracks can give valuable information as to how they are interacting. The degree of alignment is used by [Gigerenzer et al., 1999; Oliver et al., 2000; Liu and Chua, 2006], all of whom make use of such information when classifying trajectories. To calculate the alignment, the heading ($\hat{\mathbf{h}}$) of each object is taken as in equation (1.2), and the dot product (1.3) is calculated from the headings of tracks i and j.
$$\hat{\mathbf{h}}_i^t = \frac{\mathbf{p}_i^t - \mathbf{p}_i^{t-w}}{\left\| \mathbf{p}_i^t - \mathbf{p}_i^{t-w} \right\|} \qquad (1.2)$$
$$a_{i,j}^t = \hat{\mathbf{h}}_i^t \cdot \hat{\mathbf{h}}_j^t \qquad (1.3)$$
In addition to the alignment between two people, the potential intersection ($\gamma_{i,j}^t$) of two trajectories is also calculated.
Such features are suggested in [Gigerenzer et al., 1999] and [Liu and Chua, 2006]. We first test for an intersection of the headings. This is achieved as shown in Algorithm 1.
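As an illustration of the movement and alignment features in equations (1.1) to (1.3), the following Python sketch computes speed, speed difference, vorticity, heading and alignment from tracks stored as lists of (x, y) box centres, one per frame. It is a minimal sketch under stated assumptions and not the chapter's implementation: the function names are hypothetical, the line fit for vorticity is assumed to be a total least squares (orthogonal) fit because the text does not name a fitting method, "window length" is taken to be w, and a stationary object is given a zero heading. The intersection test of Algorithm 1 is not reproduced here.

```python
import math


def _check_window(t, w):
    # A window of w frames back from frame t must exist.
    if w < 1 or t < w:
        raise ValueError("need w >= 1 and t >= w")


def speed(track, t, w):
    """Equation (1.1): displacement over the last w frames, divided by w."""
    _check_window(t, w)
    (x1, y1), (x0, y0) = track[t], track[t - w]
    return math.hypot(x1 - x0, y1 - y0) / w


def speed_difference(track_i, track_j, t, w):
    """Absolute difference in speed between two tracks."""
    return abs(speed(track_i, t, w) - speed(track_j, t, w))


def vorticity(track, t, w):
    """Summed orthogonal distance to a fitted line, normalized by w.

    Assumption: the line is a total least squares fit through the
    centroid of the window, along the principal direction of the points.
    """
    _check_window(t, w)
    pts = track[t - w:t + 1]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    syy = sum((y - my) ** 2 for _, y in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    # Angle of the principal axis of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Unit normal to the fitted line.
    nx, ny = -math.sin(theta), math.cos(theta)
    total = sum(abs((x - mx) * nx + (y - my) * ny) for x, y in pts)
    return total / w


def heading(track, t, w):
    """Equation (1.2): unit vector of the displacement over w frames."""
    _check_window(t, w)
    (x1, y1), (x0, y0) = track[t], track[t - w]
    dx, dy = x1 - x0, y1 - y0
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        # Assumption: a stationary object has no heading.
        return (0.0, 0.0)
    return (dx / norm, dy / norm)


def alignment(track_i, track_j, t, w):
    """Equation (1.3): dot product of the two unit headings."""
    hi = heading(track_i, t, w)
    hj = heading(track_j, t, w)
    return hi[0] * hj[0] + hi[1] * hj[1]
```

With unit headings the alignment lies in [-1, 1]: values near 1 indicate two people moving in the same direction, and values near -1 indicate opposite directions. A larger w smooths out frame-to-frame tracker noise at the cost of slower response to changes in motion.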
Publication date: 2009